56 research outputs found

    BIRCH: A user-oriented, locally-customizable, bioinformatics system

    BACKGROUND: Molecular biologists need sophisticated analytical tools which often demand extensive computational resources. While finding, installing, and using these tools can be challenging, pipelining data from one program to the next is particularly awkward, especially when using web-based programs. At the same time, system administrators tasked with maintaining these tools do not always appreciate the needs of research biologists. RESULTS: BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution. The BIRCH core distribution includes many popular bioinformatics programs, unified within the GDE (Genetic Data Environment) graphic interface. Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multiuser bioinformatics system across different platforms and operating systems. These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base. BIRCH can also act as a front end to provide a unified view of already-existing collections of bioinformatics software. Documentation for the BIRCH and locally-added programs is merged in a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step-by-step examples, with screen shots and sample files, to illustrate both the important theoretical and practical considerations behind complex analytical tasks. CONCLUSION: BIRCH provides a versatile organizational framework for managing software and databases, and making these accessible to a user base. Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

    Feature expressions: creating and manipulating sequence datasets.

    Annotation of features, such as introns, exons and protein coding regions in GenBank/EMBL/DDBJ entries is now standardized through use of the Features Table (FT) language. The essence of the FT language is described by the relation 'expression-->sequence', meaning that each FT expression evaluates to a sequence. For example, the expression M74750:1..50 evaluates to the first 50 bases of the sequence with accession number M74750. Because FT is intrinsic to the database definition, it can serve as a software- and platform-independent lingua franca for sequence manipulation. The XYLEM package makes it possible to create and manipulate sequence datasets using FT expressions. FEATURES is a program that resolves FT expressions into their corresponding sequences. Annotated features can be retrieved either by feature key or by expression. Even unannotated portions of a sequence can be retrieved by user-generated FT expressions. Applications of the FT language include retrieval of subsequences from large sequence entries, generation of chromosome models or artificial DNA constructs, and representation of restriction maps or mutants.

    Improving the efficiency of dot-matrix similarity searches through use of an oligomer table.

    Dot-matrix sequence similarity searches can be greatly speeded up through use of a table listing all locations of short oligomers in one of the sequences to find potential similarities with a second sequence. The algorithm described finds similarities between two sequences of lengths M and N, comparing L residues at a time, with an efficiency of L × M × N/S^k, where S is the alphabet size and k is the length of the oligomer. For nucleic acids, in which S = 4, use of a tetranucleotide table (k = 4) results in an efficiency of L × M × N/256. The simplicity of the approach allows for a straightforward calculation of the level of similarities expected to be found for given search parameters. Furthermore, the storage required is minimal, allowing for even large sequences to be compared on small microcomputers. Theoretical considerations regarding the use of this search are discussed.
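
    The oligomer-table idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the published implementation: the function names, the scoring rule (count of identical residues in a window of L residues starting at the shared oligomer) and the default parameters are all hypothetical. Only position pairs that share an oligomer of length k are examined, which is where the reduction by a factor of about S^k relative to comparing every pair of positions comes from.

```python
from collections import defaultdict


def build_oligomer_table(seq, k):
    """Map every oligomer of length k to the list of its start positions in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def dot_matrix_hits(seq_a, seq_b, k=4, window=10, min_matches=8):
    """Return (i, j) pairs (0-based) to plot as dots.

    Assumed scoring: a dot is recorded when at least `min_matches` of the
    `window` residues starting at i in seq_a and j in seq_b are identical.
    Only pairs sharing an oligomer of length k at (i, j) are tested.
    """
    table = build_oligomer_table(seq_a, k)
    hits = []
    for j in range(len(seq_b) - k + 1):
        # Look up where this oligomer of seq_b occurs in seq_a.
        for i in table.get(seq_b[j:j + k], ()):
            a_win = seq_a[i:i + window]
            b_win = seq_b[j:j + window]
            if len(a_win) < window or len(b_win) < window:
                continue  # window runs off the end of a sequence
            matches = sum(x == y for x, y in zip(a_win, b_win))
            if matches >= min_matches:
                hits.append((i, j))
    return hits


# Two short made-up sequences sharing the run ACGTACGTAA.
a = "GGGGACGTACGTAACC"
b = "TTACGTACGTAATT"
diagonal = dot_matrix_hits(a, b, k=4, window=8, min_matches=8)
```

    The table is built for one sequence only, so storage grows with the length of that sequence rather than with M × N, in line with the abstract's remark that storage requirements are minimal.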

    Portable microcomputer software for nucleotide sequence analysis.

    The most common types of nucleotide sequence data analyses and handling can be done more conveniently and inexpensively on microcomputers than on large time-sharing systems. We present a package of computer programs for the analysis of DNA and RNA sequence data which overcomes many of the limitations imposed by microcomputers, while offering most of the features of programs commonly available on large computers, including sequence numbering and translation, restriction site and homology searches with dot-matrix plots, nucleotide distribution analysis, and graphic display of data. Most of the programs were written in Standard Pascal (on an Apple II computer) to facilitate portability to other micro-, mini-, and mainframe computers.

    Genome Sequence Analysis of the Oleaginous Yeast, Rhodotorula diobovata, and Comparison of the Carotenogenic and Oleaginous Pathway Genes and Gene Products with Other Oleaginous Yeasts

    Rhodotorula diobovata is an oleaginous and carotenogenic yeast, useful for diverse biotechnological applications. To understand the molecular basis of its potential applications, the genome was sequenced using the Illumina MiSeq and Ion Torrent platforms, assembled by ABySS, and annotated using the JGI annotation pipeline. The genome size, 21.1 MB, was similar to that of the biotechnological “workhorse”, R. toruloides. Comparative analyses of the R. diobovata genome sequence with those of other Rhodotorula species, Yarrowia lipolytica, Phaffia rhodozyma, Lipomyces starkeyi, and Sporidiobolus salmonicolor, were conducted, with emphasis on the carotenoid and neutral lipid biosynthesis pathways. Amino acid sequence alignments of key enzymes in the lipid biosynthesis pathway revealed why the activity of malic enzyme and ATP-citrate lyase may be ambiguous in Y. lipolytica and L. starkeyi. Phylogenetic analysis showed a close relationship between R. diobovata and R. graminis WP1. Dot-plot analysis of the coding sequences of the genes crtYB and ME1 corroborated sequence homologies between sequences from R. diobovata and R. graminis. There was, however, nonsequential alignment between crtYB CDS sequences from R. diobovata and those from X. dendrorhous. This research presents the first genome analysis of R. diobovata with a focus on its biotechnological potential as a lipid and carotenoid producer.

    Genomic Comparison of Facultatively Anaerobic and Obligatory Aerobic Caldibacillus debilis Strains GB1 and Tf Helps Explain Physiological Differences

    Caldibacillus debilis strains GB1 and Tf display distinct phenotypes. C. debilis GB1 is capable of anaerobic growth and can synthesize ethanol while C. debilis Tf cannot. Comparison of the GB1 and Tf genome sequences revealed that the genomes were highly similar in gene content and showed a high level of synteny. At the genome scale, there were several large sections of DNA that appeared to be from lateral gene transfer into the GB1 genome. Tf did have unique genetic content but at a much smaller scale; 300 genes in Tf versus 857 genes in GB1 that matched at ≤90% sequence similarity. Gene complement and copy number of genes for the glycolysis, tricarboxylic acid (TCA) cycle, and electron transport chain (ETC) pathways were identical in both strains. While Tf is an obligate aerobe, it possesses the gene complement for an anaerobic lifestyle (ldh, ak, pta, adhE, pfl). As a species, other strains of C. debilis should be expected to have the potential for anaerobic growth. Assaying the whole cell lysate for ADH activity revealed an approximately 2-fold increase in the enzymatic activity in GB1 when compared to Tf.

    Characterization of a single copy gene encoding ferredoxin I from pea.

    We have isolated, mapped, and sequenced a genomic clone containing the ferredoxin I (Fed-1) gene from Pisum sativum. The gene is present as a single copy per haploid genome. It has no introns, and it specifies a 753-nucleotide transcript encoding a 149-amino acid protein including a 52-residue transit peptide. Upstream sequences from Fed-1 contain several elements with similarity to transcriptional regulatory elements from RbcS and Cab genes, and gel mobility shift assays show that nuclear extracts from light-grown pea leaves contain one or more DNA binding activities specific for Fed-1 5'-flanking sequences. RbcS and Cab regulatory sequences are only weak competitors for this binding, however, and the RbcS and Cab similarities mostly lie outside of the region essential for binding. These data are discussed in terms of previously observed physiological differences between the light responses of Fed-1 and other genes.